SUMMARY


The research presented here, although motivated by the single
theme of finding the channel capacity of a discrete memoryless channel
for codes other than block codes, is divided into two essentially
independent sets of results.

First, in section II a general framework for encoding and
decoding is presented which includes block coding. The key concept
used with the generalized codes is that of decoding rate. A weak
converse is proven using decoding rate which shows that channel
capacity for the generalized codes is the same as the usual block
coding channel capacity C for a discrete memoryless channel.

Second, in section III uniform finite-memory codes are defined
from the general framework after several motivating definitions of
properties which seem natural to require of any code. Channel
capacity $C_u$ is defined for these codes but what its value is remains
an open question. A class of channels is given for which $C_u$ is
nonzero for each member of the class. From the converse in section II
it is known that $C_u \le C$.


I.  NOTATION  AND  PRELIMINARY  FACTS


For any set A and any positive integer n, $A^n$ denotes the set of
all n-tuples of elements from A. $\prod_{i=-\infty}^{r} A_i$ denotes the set of all
sequences $(\dots,x_{r-1},x_r)$ of elements from A while $A^I$ denotes the set of
all sequences $x = (x_1,x_2,\dots)$ of elements from A. If $w \in A^n$ then
$w(i) = x_i$ for $w = (x_1,\dots,x_n)$ and $1 \le i \le n$. Similarly, $x(i) = x_i$
for $x \in A^I$. For A a finite set, $|A|$ denotes the number of elements
in A and $\sigma(A^I)$ denotes the $\sigma$-field of subsets of $A^I$ determined by
cylinder sets.

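A discrete memoryless channel (DMC) is a triple $(B,\bar B,p)$ where

(i) $B$ is a finite set of elements called inputs,

(ii) $\bar B$ is a finite set of elements called outputs,

(iii) $p = p(\cdot/\cdot)$ is a function on $\bar B \times B$ such that $p(\cdot/y)$ is a
probability distribution on $\bar B$ for each $y \in B$, and

(iv) for each positive integer t and for all sequences
$(y_1,\dots,y_t) \in B^t$, $(\bar y_1,\dots,\bar y_t) \in \bar B^t$,

  $p(\bar y_1,\dots,\bar y_t / y_1,\dots,y_t) = \prod_{i=1}^{t} p(\bar y_i/y_i)$.

The n-extension of $(B,\bar B,p)$ is the DMC $(B^n,\bar B^n,q)$ where n is
a positive integer and for each $(y_1,\dots,y_n) \in B^n$, $(\bar y_1,\dots,\bar y_n) \in \bar B^n$,

  $q(\bar y_1,\dots,\bar y_n / y_1,\dots,y_n) = \prod_{i=1}^{n} p(\bar y_i/y_i)$.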


A source is a sequence $\{X_i, i = 1,2,\dots\}$ of finite-valued random
variables which are independent, identically distributed with a uniform
distribution. In section III a source $\{X_i, -\infty < i < \infty\}$ will be used.
Frequently it will be helpful to think of the subscript i as
corresponding to time.
Throughout this report all random variables are finite-valued
unless otherwise noted. They will be denoted by capital letters
X,Y,Z,... and their values by small letters x,y,z,... . The ranges
of X and Y will be denoted by A and B respectively, where corresponding
affixes are used when necessary. For example, $Y$ ($\bar Y$) will, with a
subscript to distinguish order, denote an input (output) random
variable to the channel $(B,\bar B,p)$. For any random variable Z,
p(z) will denote P[Z = z], the probability that Z = z.

For finite-valued random variables U and V the numbers

  $I(U) = -\sum_{u} p(u)\log p(u)$  and  $I(U/V) = -\sum_{u,v} p(u,v)\log p(u/v)$

are called the average uncertainty of U and the average uncertainty
of U given V respectively. $J(U,V) = I(U) - I(U/V)$ is called the
average mutual uncertainty of U and V. All logarithms used will be
to the base 2.

For a probability distribution $p(\cdot)$ on B of $(B,\bar B,p)$ with
$p(y,\bar y) = p(y)p(\bar y/y)$ and $p(\bar y) = \sum_{y} p(y,\bar y)$ for $y \in B$, $\bar y \in \bar B$,

  $C = \sup_{p(\cdot)} J(Y,\bar Y)$   (1.1)

is called the block coding channel capacity of $(B,\bar B,p)$. A standard
result for the DMC $(B,\bar B,p)$ will be used without comment. Let k
and n be positive integers and let $X_1,\dots,X_k$ be a sequence of random
variables, all with the same range A. Then for any finite collection
of functions $\{f_r\} = \{f_r: A^k \to B^n\}$ and any probability distribution
$p(x_1,\dots,x_k,r)$ for the random variables $X_1,\dots,X_k$ and the function
f (a random variable)

  $J((X_1,\dots,X_k),(\bar Y_1,\dots,\bar Y_n)) \le nC$   (1.2)

where $p(x_1,\dots,x_k;\bar y_1,\dots,\bar y_n) = \sum_{r} p(x_1,\dots,x_k,r)\prod_{i=1}^{n} p(\bar y_i/y_{ri})$ and
$(y_{r1},\dots,y_{rn}) = f_r(x_1,\dots,x_k)$.
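As a numerical illustration of the capacity $C = \sup J(Y,\bar Y)$, the following sketch approximates the supremum by a grid search over input distributions. It is only a sketch under stated assumptions: the channel is restricted to two inputs, the function names and the binary symmetric example are hypothetical, and the standard formula for the average mutual uncertainty is assumed.

```python
import math

def mutual_uncertainty(p_in, channel):
    """J(Y, Ybar) in bits for input distribution p_in[y] and rows channel[y][ybar]."""
    n_out = len(channel[0])
    # Output distribution p(ybar) = sum_y p(y) p(ybar/y).
    p_out = [sum(p_in[y] * channel[y][o] for y in range(len(p_in)))
             for o in range(n_out)]
    total = 0.0
    for y, py in enumerate(p_in):
        for o in range(n_out):
            joint = py * channel[y][o]
            if joint > 0:
                total += joint * math.log2(channel[y][o] / p_out[o])
    return total

def capacity_binary_input(channel, steps=10000):
    """Approximate C by a grid search over p(Y = 0); assumes exactly two inputs."""
    return max(mutual_uncertainty([i / steps, 1 - i / steps], channel)
               for i in range(steps + 1))

# Hypothetical channel: binary symmetric with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
print(capacity_binary_input(bsc))
```

For the symmetric example the maximum is attained at the uniform input distribution, which lies on the grid, so the printed value agrees with $1 - h(0.1)$ up to rounding.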

One result used often in the sequel is Fano's Inequality.
For two arbitrary random variables X and Y (not necessarily a source
output or channel input) the value of X is decided on from the
occurrence of Y by a function $g: B \to A$ (a decoder in the terminology
of this report). The probability of error $P[g(Y) \ne X]$ for any g
is related to the average uncertainty of X given Y by Fano's Inequality:

  $I(X/Y) \le h(P[g(Y) \ne X]) + P[g(Y) \ne X]\log|A|$   (1.3)

where $h(x) = -x\log x - (1-x)\log(1-x)$, $0 < x < 1$, and $h(x) = 0$ if $x = 0,1$.
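Fano's Inequality in the form (1.3) can be checked numerically on a small example. The sketch below is illustrative only: the helper names and the three-symbol joint distribution are assumptions made here, not taken from the report.

```python
import math

def h(x):
    """Binary uncertainty function h(x), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def fano_check(joint, g, size_a):
    """Return (I(X/Y), h(Pe) + Pe log2|A|) for a joint dict (x, y) -> p(x, y)."""
    p_y = {}
    for (x, y), q in joint.items():
        p_y[y] = p_y.get(y, 0.0) + q
    # Average uncertainty of X given Y.
    cond = -sum(q * math.log2(q / p_y[y])
                for (x, y), q in joint.items() if q > 0)
    # Probability of error of the decoder g.
    p_err = sum(q for (x, y), q in joint.items() if g(y) != x)
    return cond, h(p_err) + p_err * math.log2(size_a)

# Hypothetical example: X uniform on {0, 1, 2}; Y = X with probability 0.8,
# otherwise each of the two other symbols with probability 0.1.
joint = {(x, y): (0.8 if x == y else 0.1) / 3
         for x in range(3) for y in range(3)}
cond, bound = fano_check(joint, lambda y: y, 3)
print(cond, bound)
```

With the identity decoder the error probability is 0.2, and the first printed number (the conditional uncertainty) does not exceed the second (the right-hand side of (1.3)).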


II.  GENERAL  CODING  FRAMEWORK 


In this section general codes are considered which (i) map
infinite source sequences $(x_1,x_2,\dots)$ into infinite sequences of
channel inputs $(y_1,y_2,\dots)$ and (ii) map infinite channel output
sequences $(\bar y_1,\bar y_2,\dots)$ into infinite decoded source sequences $(\hat x_1,\hat x_2,\dots)$
in ways other than breaking up the sequences into independent blocks
as usually done by block coding. The main result of this section
is Theorem 1 which proves that channel capacity for these general
codes is the same as the channel capacity for block codes. All
channels considered are discrete memoryless channels.

To begin this section the basic facts of block coding are
given. A standard way of sending a source output $\{X_i\}$ over a DMC
$(B,\bar B,p)$ is by block coding. A function $f: A^k \to B^n$ (k and n are positive
integers) called the encoder maps (encodes) k-tuples of source outputs
into n-tuples of channel inputs as given by

  $(Y_{jn+1},\dots,Y_{(j+1)n}) = f(X_{jk+1},\dots,X_{(j+1)k}), \quad j = 0,1,\dots$   (2.1)

From the channel outputs a function $g: \bar B^n \to A^k$ called the decoder
maps (decodes) the channel outputs into source symbols as given by

  $(\hat X_{jk+1},\dots,\hat X_{(j+1)k}) = g(\bar Y_{jn+1},\dots,\bar Y_{(j+1)n}), \quad j = 0,1,\dots$   (2.2)

The goal of the code (f,g) is, of course, to have
$(\hat X_{jk+1},\dots,\hat X_{(j+1)k}) = (X_{jk+1},\dots,X_{(j+1)k})$ with high probability,
j = 0,1,... . The diagram of Figure 1 illustrates the block coding relations.

[Figure 1. Block Coding Relations for a Block Code (f,g). Rows of the
diagram: source outputs, channel inputs, channel outputs, decoded
source outputs.]

For all j = 1,2,... the joint probability distribution of source
outputs and channel outputs is defined by

  $p(x_1,\dots,x_{jk};\bar y_1,\dots,\bar y_{jn}) = \frac{1}{|A|^{jk}}\prod_{i=1}^{jn} p(\bar y_i/y_i)$   (2.3)

where $(y_{rn+1},\dots,y_{(r+1)n}) = f(x_{rk+1},\dots,x_{(r+1)k})$, r = 0,...,j-1. Thus blocks
of k-tuples from the source are encoded and decoded independently
of other blocks so that $P[(\hat X_{jk+1},\dots,\hat X_{(j+1)k}) \ne (X_{jk+1},\dots,X_{(j+1)k})]$ is
independent of j. $e(f,g) = P[(\hat X_1,\dots,\hat X_k) \ne (X_1,\dots,X_k)]$ is called the
probability of error for the block code (f,g).

$R = \frac{k}{n}\log|A|$ is called the rate of the block code (f,g). As
the average number of bits per channel input, it measures the density
of source outputs per channel input. From the decoding viewpoint
(the viewpoint which is useful for general codes used later) R is
the average number of bits decoded per channel output.
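To make the two quantities attached to a block code concrete, the sketch below computes the rate $R = (k/n)\log|A|$ and, for one hypothetical code, the block error probability e(f,g). The three-fold repetition code with majority decoding over a binary symmetric channel is an assumption chosen for illustration; it does not appear in the report.

```python
import math

def block_rate(k, n, alphabet_size):
    """R = (k/n) log2|A|, in bits per channel input."""
    return (k / n) * math.log2(alphabet_size)

def repetition3_error(p):
    """e(f,g) for k = 1, n = 3 repetition coding with majority decoding.

    The decoded bit is wrong exactly when two or three of the three
    channel uses are flipped by a binary symmetric channel with
    crossover probability p.
    """
    return 3 * p**2 * (1 - p) + p**3

print(block_rate(1, 3, 2))       # rate of the repetition code
print(repetition3_error(0.1))    # its block error probability at p = 0.1
```

The example shows the usual trade: the repetition code lowers the error probability from 0.1 to about 0.028 at the cost of reducing the rate from 1 to 1/3 bit per channel input.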

The block coding channel capacity C of a DMC $(B,\bar B,p)$ has the
following property

  $C = \sup\{R' : \inf_{(f,g)}\{e(f,g) : \frac{k}{n}\log|A| \ge R'\} = 0\}$   (2.4)

which is just a result of the usual coding theorem for a DMC $(B,\bar B,p)$.

From the observation that there are many conceivable ways other
than block coding to send a source output $(X_1,X_2,\dots)$ over a channel,
the question naturally arises as to whether (i) a code rate can be
defined for a general class of codes (for which channel capacity is
defined for some suitable error criterion) and (ii), if so, do block
codes attain the channel capacity of the general codes? To begin
answering (i) and (ii) a general class of encoders and decoders is
first defined.
One of the properties that any practical encoder should have and
which a block encoder does have is that the nth channel input depends
on only a finite number of source outputs. Similarly, a decoder
should decide on the kth source symbol after some finite number of
channel outputs. Therefore the following definitions are made.

An encoder for $\{X_i\}$ and $(B,\bar B,p)$ is a sequence of functions
$\{f_n: A^{K(n)} \to B\}$ where $\{K(n)\}$ is a sequence of positive integers and
$f_n$ determines the nth channel input from the first K(n) source
outputs. Since there are many different sequences $\{K(n)\}$ which
could be used for the representation of the same encoder it is
assumed that K(n) is, for each n, the smallest positive integer r
such that the first n channel inputs depend on at most the first
r source outputs. With this assumption, if the source subscripts
of $\{X_i\}$ correspond to time in seconds, then K(n) is the earliest
time in seconds that the nth channel input could be sent where, of
course, the nth channel input is not sent before the (n-1)th even
though it may be determined by earlier source outputs.

The block encoder $f: A^{k'} \to B^{n'}$ is obviously a special case of
the encoder defined above. With the general encoder, however,
the nth channel input can depend on the whole source output $x_1,\dots,x_{K(n)}$
while the nth channel input of the block encoder can depend on at
most the source outputs $x_{jk'+1},\dots,x_{(j+1)k'}$ where j satisfies
$jn' < n \le (j+1)n'$. In addition, the functions $f_n$ which correspond to
the block encoder $f: A^{k'} \to B^{n'}$ are periodic with period n'.
A decoder for $\{X_i\}$ and $(B,\bar B,p)$ is a sequence of functions
$\{g_k: \bar B^{N(k)} \to A\}$ where $\{N(k)\}$ is a sequence of positive integers and
$g_k$ determines the kth decoded source output from the first N(k)
channel outputs. As with the sequence $\{K(n)\}$ for the encoder,
it is assumed that N(k) is, for each k, the smallest integer r
such that the first k decoded outputs depend on at most the first
r channel outputs. Remarks comparing a block decoder with the
decoder defined here are similar to those made for the encoders.

A  diagram  illustrating  the  general  encoder  and  decoder  is  shown 
in  Figure  2. 

Throughout, an arbitrary but fixed source $\{X_i\}$ and DMC $(B,\bar B,p)$
will be understood if not explicitly stated. For the sequences
$\{K(n)\}$ and $\{N(k)\}$ understood, an encoder and decoder for $\{X_i\}$ and
$(B,\bar B,p)$ will be denoted by $\{f_n\}$ and $\{g_k\}$ respectively. The pair
$(\{f_n\},\{g_k\})$ is called a code.

For an encoder $\{f_n\}$ the following probability distribution
between source outputs and channel outputs will always be assumed.
For each positive integer t, for each $(x_1,\dots,x_{K(t)})$ and $(\bar y_1,\dots,\bar y_t)$,

  $p(x_1,\dots,x_{K(t)};\bar y_1,\dots,\bar y_t) = \frac{1}{|A|^{K(t)}}\prod_{i=1}^{t} p(\bar y_i/f_i(x_1,\dots,x_{K(i)}))$.

Because the probability distributions are consistent in t, that is,

  $\sum_{x_{K(t)+1},\dots,x_{K(t+1)},\,\bar y_{t+1}} p(x_1,\dots,x_{K(t+1)};\bar y_1,\dots,\bar y_{t+1}) = p(x_1,\dots,x_{K(t)};\bar y_1,\dots,\bar y_t)$   (2.5)

for t = 1,2,..., the marginal probability distributions $p(x_1,\dots,x_i;\bar y_1,\dots,\bar y_j)$
are determined for all i,j (assume $K(t) \to \infty$).
[Figure 2. General Encoding and Decoding Relations for a Code
$(\{f_n\},\{g_k\})$ with K(2) = K(3). Rows of the diagram: source outputs,
channel inputs, channel outputs, decoded source outputs.]


For a code $(\{f_n\},\{g_k\})$ the channel input random variables
are defined by

  $Y_n = f_n(X_1,\dots,X_{K(n)}), \quad n = 1,2,\dots$   (2.6)

and the decoded source random variables by

  $\hat X_k = g_k(\bar Y_1,\dots,\bar Y_{N(k)})$.   (2.7)

For each k, $\frac{k\log|A|}{N(k)}$ is the average number of bits decoded
per channel output from the first N(k) channel outputs. The rate
$R_d$ of the decoder $\{g_k\}$ is defined by

  $R_d = \liminf_{k\to\infty} \frac{k\log|A|}{N(k)}$   (2.8)

which is just the average number of bits decoded per channel output
when the above limit exists. For a block decoder $g: \bar B^{n'} \to A^{k'}$,
$N(k) = N(i) + jn'$ for $k = i + jk'$, $0 < i \le k'$, j = 0,1,2,... so that
$R_d = \lim_{k\to\infty} \frac{k\log|A|}{N(k)} = \frac{k'}{n'}\log|A|$, the usual rate for a block code (f,g).
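The convergence of $k\log|A|/N(k)$ to the block rate can be seen numerically. The sketch below assumes, as in the block decoder described above, that every block of k' decoded symbols becomes available after n' further channel outputs, so that $N(k) = n'\lceil k/k' \rceil$; the function names and the values k' = 4, n' = 7 are hypothetical.

```python
import math

def block_decoder_N(k, k_block, n_block):
    """N(k) for a block decoder: n' * ceil(k / k'), using integer arithmetic."""
    return n_block * -(-k // k_block)

def bits_per_output(k, k_block, n_block, alphabet_size):
    """k log2|A| / N(k): bits decoded per channel output after k symbols."""
    return k * math.log2(alphabet_size) / block_decoder_N(k, k_block, n_block)

# Hypothetical block decoder with k' = 4, n' = 7 and a binary source.
for k in (1, 4, 5, 8, 4001):
    print(k, block_decoder_N(k, 4, 7), bits_per_output(k, 4, 7, 2))
```

The ratio equals 4/7 whenever k is a multiple of k' and falls slightly below it otherwise, with the gap shrinking as k grows, which is why the limit equals the block rate.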


With a block code (f,g) the error criterion which was used
was the probability of error for a block:
$P[(\hat X_{jk'+1},\dots,\hat X_{(j+1)k'}) \ne (X_{jk'+1},\dots,X_{(j+1)k'})] = e(f,g)$. For a code
$(\{f_n\},\{g_k\})$ the error criterion used will be the average coordinate
probability of error

  $\bar e_t = \frac{1}{t}\sum_{i=1}^{t} P[\hat X_i \ne X_i]$.

In particular, the interest here will be in what happens to $\bar e_t$ as
$t \to \infty$. Note that for a block code (f,g), $\bar e_t \le e(f,g)$ and thus $\bar e_t$
is a weaker error criterion for a block code than e(f,g); however,
Theorem 1 is a converse and, consequently, a converse in terms of
$\limsup_t \bar e_t$ is a stronger statement than a converse in terms of an
average error for blocks of length r:

  $\bar e_{t,r} = \frac{1}{t}\sum_{i=0}^{t-1} P[(\hat X_{ir+1},\dots,\hat X_{(i+1)r}) \ne (X_{ir+1},\dots,X_{(i+1)r})]$,

that is, $\limsup_t \bar e_t \le \limsup_t \bar e_{t,r}$, r = 1,2,... . Channel capacity C' for
the class of codes $(\{f_n\},\{g_k\})$ for a DMC $(B,\bar B,p)$ is thus defined by

  $C' = \sup\{R : \inf_{(\{f_n\},\{g_k\})\ \mathrm{with}\ R_d \ge R}\{\limsup_t \bar e_t\} = 0\}$.   (2.9)

It is easy to see that for a code $(\{f_n\},\{g_k\})$ which corresponds to a
block code (f,g), $\limsup_t \bar e_t \le e(f,g)$ and, since $R_d$ is equal to the block
code rate, it follows from (2.4) that $C' \ge C$.

The main result of this section follows from Theorem 1: $C' \le C$, and
hence C' = C.

Theorem 1. For a DMC $(B,\bar B,p)$ with block coding capacity C, let
$(\{f_n: A^{K(n)} \to B\},\{g_k: \bar B^{N(k)} \to A\})$ be any code with decoding rate $R_d$. If
$R_d > C$ then there exists a positive number $\alpha(C/R_d)$ which depends
only on $C/R_d$ and not on $\{f_n\}$, $\{g_k\}$ or $(B,\bar B,p)$ such that
$\liminf_t \bar e_t \ge \alpha$.

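Lemma 1. For all t, $1 - \frac{N(t)C}{t\log|A|} \le h(\bar e_t) + \bar e_t$.

Proof of Lemma. From (1.2)

  $J((X_1,\dots,X_t),(\bar Y_1,\dots,\bar Y_{N(t)})) \le N(t)C \;\Rightarrow\;$

  $\frac{1}{t}I(X_1,\dots,X_t) - \frac{N(t)C}{t} \le \frac{1}{t}\sum_{i=1}^{t} I(X_i/\bar Y_1,\dots,\bar Y_{N(i)})$.

From $I(X_1,\dots,X_t) = t\log|A|$ and Fano's inequality (1.3)

  $1 - \frac{N(t)C}{t\log|A|} \le \frac{1}{\log|A|}\left[\frac{1}{t}\sum_{i=1}^{t} h(P[\hat X_i \ne X_i]) + \bar e_t\log|A|\right] \le h(\bar e_t) + \bar e_t$

since h(x) is concave and $\log|A| \ge 1$. This
completes the proof of the lemma.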
Proof of Theorem 1. First, by Lemma 1

  $\liminf_t \left(1 - \frac{N(t)C}{t\log|A|}\right) = 1 - \limsup_t \frac{N(t)C}{t\log|A|} = 1 - \frac{C}{R_d} \le \lim_{t'} \left[h(\bar e_{t'}) + \bar e_{t'}\right]$

where t' is any subsequence of $\{t\}$ such that
$h(\bar e_{t'}) + \bar e_{t'}$ converges. In particular, for $R_d > C$

  $0 < 1 - C/R_d \le h(\liminf_t \bar e_t) + \liminf_t \bar e_t$.

Now, let $\alpha$ be the unique real number, $0 < \alpha \le 1/2$, such that

  $1 - C/R_d = h(\alpha) + \alpha$.   (2.10)

Clearly, $\alpha = \alpha(C/R_d) > 0$ and $\liminf_t \bar e_t \ge \alpha$ so that Theorem 1 is proved.
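The lower bound $\alpha(C/R_d)$ has no closed form, but it is easy to compute. The sketch below solves $1 - C/R_d = h(\alpha) + \alpha$ by bisection, using the fact that $h(x) + x$ is increasing on $[0, 1/2]$; the function names and the tolerance are choices made here for illustration.

```python
import math

def h(x):
    """Binary uncertainty function h(x), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def alpha(c_over_rd, tol=1e-12):
    """Solve 1 - C/R_d = h(a) + a for a in (0, 1/2]; requires 0 < C/R_d < 1."""
    if not 0.0 < c_over_rd < 1.0:
        raise ValueError("C/R_d must lie strictly between 0 and 1")
    target = 1.0 - c_over_rd
    lo, hi = 0.0, 0.5            # h(0) + 0 = 0 < target < 1 < h(1/2) + 1/2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) + mid < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Error floor when decoding at twice capacity (C/R_d = 0.5).
a = alpha(0.5)
print(a, h(a) + a)
```

As $C/R_d$ approaches 1 the bound tends to 0, and it increases as the decoding rate moves further above capacity.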
If $R_d$ had been defined as $\limsup_k \frac{k\log|A|}{N(k)}$ then Theorem 1
would be true with $\limsup_t \bar e_t \ge \alpha$ replacing $\liminf_t \bar e_t \ge \alpha$. The
proof is straightforward following the proof above.

There is an interesting generalization of the decoders $\{g_k\}$
used here suggested by the work of Blackwell. Suppose that the
decoder can change its past decisions, that is, suppose $\hat X_1,\dots,\hat X_k$
can be changed after N(t) channel outputs where t > k. To make
sense from a communication standpoint the first k decoded source
outputs should be changed only a finite number of times with
probability 1 for each k. This point, however, will not be needed
for the converse (Theorem 2) below.

Let $\{g_k^*: \bar B^{N(k)} \to A^k\}$ be any sequence of functions where the same
assumptions are made for $\{N(k)\}$ as before. (The superscript * will
be used to denote a decoder $\{g_k^*\}$ which can change its previous
decisions.) Let

  $(\hat X_{1k},\dots,\hat X_{kk}) = g_k^*(\bar Y_1,\dots,\bar Y_{N(k)})$

where $\hat X_{ik}$ ($i \le k$) represents the decision for the ith source output
after N(k) channel outputs. For the decoders previously used,
$\hat X_{ik} = \hat X_{ik'} = \hat X_i$ for $k,k' \ge i$, that is, no changes in the decoded
outputs were made. A code $(\{f_n\},\{g_k^*\})$ is illustrated in Figure 3.

As before, the rate of a decoder $\{g_k^*\}$ will be defined by

  $R_d^* = \liminf_k \frac{k\log|A|}{N(k)}$.

Theorem 2. For a DMC $(B,\bar B,p)$ with block coding capacity C and any
code $(\{f_n\},\{g_k^*\})$ with $R_d^* > C$, $\liminf_t \bar e_t^* \ge \alpha > 0$ where
$\bar e_t^* = \frac{1}{t}\sum_{i=1}^{t} P[\hat X_{it} \ne X_i]$ and $\alpha$ is defined by (2.10).


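Lemma 2. For all t, $1 - \frac{N(t)C}{t\log|A|} \le h(\bar e_t^*) + \bar e_t^*$.

Proof of Lemma. As in the proof of Lemma 1,

  $\frac{1}{t}I(X_1,\dots,X_t) - \frac{1}{t}I(X_1,\dots,X_t/\bar Y_1,\dots,\bar Y_{N(t)}) \le \frac{N(t)C}{t} \;\Rightarrow\;$

  $\frac{1}{t}I(X_1,\dots,X_t) - \frac{N(t)C}{t} \le \frac{1}{t}\sum_{i=1}^{t} I(X_i/\bar Y_1,\dots,\bar Y_{N(t)})$.

(In the proof of Lemma 1 the sum on the right hand side of the
last inequality was $\frac{1}{t}\sum_{i=1}^{t} I(X_i/\bar Y_1,\dots,\bar Y_{N(i)})$.) From
$I(X_1,\dots,X_t) = t\log|A|$ and Fano's inequality

  $1 - \frac{N(t)C}{t\log|A|} \le \frac{1}{\log|A|}\left[\frac{1}{t}\sum_{i=1}^{t} h(P[\hat X_{it} \ne X_i]) + \bar e_t^*\log|A|\right] \le h(\bar e_t^*) + \bar e_t^*$

since h(x) is concave and $\log|A| \ge 1$. This
proves the lemma.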


[Figure 3. Encoding and Decoding Relations for a Code $(\{f_n\},\{g_k^*\})$.
Rows of the diagram: source outputs, channel inputs, channel outputs,
decoded source outputs $(\hat X_{11})$, $(\hat X_{12},\hat X_{22})$, $(\hat X_{13},\hat X_{23},\hat X_{33})$, ... .]


The proof of Theorem 2 proceeds from this point exactly as in
Theorem 1.

It can be concluded from Theorem 2 that the channel capacity C*
for the class of codes $(\{f_n\},\{g_k^*\})$, where

  $C^* = \sup\{R : \inf_{(\{f_n\},\{g_k^*\})\ \mathrm{with}\ R_d^* \ge R}\{\limsup_t \bar e_t^*\} = 0\}$,

is equal to C.

In both Theorems 1 and 2 no use was made of the encoder $\{f_n\}$.
In fact, Theorems 1 and 2 hold for any function
$f: A^I \to B^I$ which is $(\sigma(A^I),\sigma(B^I))$-measurable. What is needed is the
definition of the probability distributions $p(x_1,\dots,x_i;\bar y_1,\dots,\bar y_j)$,
i,j = 1,2,... for the general f, since then $J((X_1,\dots,X_i),(\bar Y_1,\dots,\bar Y_j))$
is defined for all i,j and is $\le jC$. From this point the proofs
go through as before. The definition of $p(x_1,\dots,x_i;\bar y_1,\dots,\bar y_j)$ for
$f: A^I \to B^I$, $(\sigma(A^I),\sigma(B^I))$-measurable is given in the appendix.

The rate $R_e$ of the encoder $\{f_n\}$ is defined by

  $R_e = \liminf_n \frac{K(n)}{n}\log|A|$.

If the source subscripts correspond to time in seconds then K(n)
is the time in seconds the nth channel input is transmitted (assuming
zero delays in the encoding equipment) and K(N(k)) is the time in
seconds the kth source output is decoded (again, assuming zero
delays in channel transmission and in the decoding equipment).

One natural requirement for any code is that of bounding the time
lag (positive or negative) between the time an output occurs and
the time it is decoded, that is, require $\sup_k |K(N(k)) - k| < \infty$.




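Theorem 3. For a code $(\{f_n\},\{g_k\})$ with $\sup_k |K(N(k)) - k| < \infty$,
$R_e \le R_d$.

Proof. $\sup_k |K(N(k)) - k| < \infty \;\Rightarrow\; \lim_{k\to\infty} \frac{K(N(k))}{k} = 1 \;\Rightarrow\;$
$\lim_{k\to\infty} \frac{K(N(k))}{N(k)}\cdot\frac{N(k)}{k} = 1$.

Pick a subsequence k' such that

  $\lim_{k'\to\infty} \frac{k'}{N(k')} = \frac{R_d}{\log|A|}$

so that $\lim_{k'\to\infty} \frac{K(N(k'))}{N(k')}$ exists and equals

  $\frac{R_d}{\log|A|} \ge \frac{R_e}{\log|A|}$,

the inequality holding because $R_e/\log|A|$ is the lim inf of $K(n)/n$
over all n. This completes the proof.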

From the proof it is immediate that Theorem 3 remains true
if $\sup_k |K(N(k)) - k| < \infty$ is replaced by the weaker condition
$\lim_{k\to\infty} \frac{K(N(k))}{k} = 1$.

Since both Theorems 1 and 2 are weak converses the next question
to ask would be whether $\bar e_t \to 1$ or $\bar e_t^* \to 1$ (the strong converse statements)
for $R_d > C$. Since one could guess each source output correctly with
a probability of at least $1/|A|$, regardless of the decoding rate $R_d$, it
follows that $\bar e_t \le 1 - 1/|A|$ for all t for at least one decoder $\{g_k\}$.
Thus there is no strong converse statement in terms of $\bar e_t$. Similar
remarks are true for $\bar e_t^*$.


Strong converses can be obtained for a different probability of
error as follows. For a code $(\{f_n\},\{g_k\})$ one may consider the encoding of
$(X_1,\dots,X_k)$ to $(Y_1,\dots,Y_{N(k)})$ to be random, that is, if $K(N(k)) > k$
then what $(x_1,\dots,x_k)$ is mapped into (i.e., what sequence
$(y_1,\dots,y_{N(k)})$) depends on $X_{k+1},\dots,X_{K(N(k))}$. (If $K(N(k)) \le k$ there is
no question of random encoding since $X_1,\dots,X_k$ determines $Y_1,\dots,Y_{N(k)}$.)
Since the strong converse for block codes continues to hold when the
encoder is random the following result is obtained.

† This fact was pointed out to the writer by Professor D. Blackwell.

Theorem 4. For a DMC $(B,\bar B,p)$ with block coding capacity C let
$(\{f_n\},\{g_k\})$ be a code with $R_d > C$. Then

  $e_t = P\left[\bigcup_{i=1}^{t}\{g_i(\bar Y_1,\dots,\bar Y_{N(i)}) \ne X_i\}\right] \to 1$.

The same argument applies for the redecoding case.

Theorem 5. For a DMC $(B,\bar B,p)$ let $(\{f_n\},\{g_k^*\})$ be any code with $R_d^* > C$.
Then

  $e_t^* = P\left[g_t^*(\bar Y_1,\dots,\bar Y_{N(t)}) \ne (X_1,\dots,X_t)\right] \to 1$.


The codes $(\{f_n\},\{g_k\})$ of Theorem 1 can be changed for use with a
doubly infinite source $\{X_i, -\infty < i < \infty\}$. An encoder becomes a sequence
of functions $\{f_n: \prod_{i=-\infty}^{K(n)} A_i \to B\}$ where it is required that each
$f_n$ ($-\infty < n < \infty$) depend on at most a finite number of coordinates of
$\prod_{i=-\infty}^{K(n)} A_i$. The sequence $\{K(n)\}$ will be a sequence of integers such
that K(n) is, for each n, the smallest integer r such that all channel
inputs up to and including the nth do not depend on the source outputs
$(X_{r+1},X_{r+2},\dots)$. A decoder is a sequence of functions
$\{g_k: \prod_{i=D(k)}^{N(k)} \bar B_i \to A\}$ where it is assumed that

(i) N(k) is, for each k, the smallest integer r such that all decoded
outputs $\hat X_i = g_i(\bar Y_{D(i)},\dots,\bar Y_{N(i)})$ up to and including the kth
do not depend on channel outputs $(\bar Y_{r+1},\bar Y_{r+2},\dots)$, and

(ii) $D(k) \le N(k)$, $D(k) \le D(k+1)$ for all k.

$R_d = \liminf_{k\to\infty} \frac{k\log|A|}{N(k)}$ will be called the rate of the decoder
$\{g_k\}$. The average number of bits decoded per channel output
between the tth and kth decoded outputs (t < k) is
(assume N(k) > N(t))

  $\frac{(k-t)\log|A|}{N(k) - N(t)}$

and

  $\liminf_{k\to\infty} \frac{(k-t)\log|A|}{N(k) - N(t)} = R_d$ for all t: $-\infty < t < \infty$,

that is, the same rate is obtained independently of which decoded source
output is used as a starting point.

The analogous statement of Theorem 1 for the doubly-infinite source is

Theorem 6. For a DMC $(B,\bar B,p)$ with block coding capacity C let
$(\{f_n\},\{g_k\})$ be any code for a doubly-infinite source with $R_d > C$. Then

  $\liminf_t \frac{1}{t}\sum_{i=r}^{r+t-1} P[\hat X_i \ne X_i] \ge \alpha(C/R_d) > 0$

for all r: $-\infty < r < \infty$ where $\alpha$ is defined by (2.10).

Lemma 3. For each r: $-\infty < r < \infty$ and for all t: $1 \le t < \infty$,

  [inequality illegible in the source]

The proofs of Lemma 3 and Theorem 6 offer no difficulty following
the proofs of Lemma 1 and Theorem 1, and thus are omitted.




III.  UNIFORM  FINITE-MEMORY  CODES


For this section $\{X_i\}$ will denote a doubly-infinite source
$\{X_i, -\infty < i < \infty\}$ and all codes discussed will be for such sources.

To begin, suppose the code $(\{f_n\},\{g_k\})$ of section II takes almost the
simplest form possible:

  $f_n = f^{(m)}: A^m \to B$ for all n: $-\infty < n < \infty$,
  $g_k = g^{(m)}: \bar B^m \to A$ for all k: $-\infty < k < \infty$,   (3.1)
  with $\{K(n)\} = \{n\}$ and $\{N(k)\} = \{k + m - 1\}$.

The code is illustrated in Figure 4. $\bar Y_1,\dots,\bar Y_m$ are used to decode
$X_1$ since in the encoder $Y_1,\dots,Y_m$ are the only channel inputs which
depend on $X_1$ and, intuitively then, $\bar Y_1,\dots,\bar Y_m$ should be the most
important channel outputs for decoding $X_1$. Of course, using other
channel outputs will improve the probability of deciding $X_1$ correctly
but, for the present, only $\bar Y_1,\dots,\bar Y_m$ will be used. Because of the
stationarity involved with the above code it is clear that $P[\hat X_i \ne X_i]$
is the same for all i so that it suffices to restrict attention to
$X_1$ when discussing the probability of error for $X_i$. The following
questions arise. When can $P[\hat X_1 \ne X_1] \to 0$, that is, when does there
exist a sequence $\{(f^{(m)},g^{(m)})\}$ of codes of the above form such that
$P[\hat X_1 \ne X_1] \to 0$ as $m \to \infty$? What if the n-extension $(B^n,\bar B^n,q)$ is used
instead of $(B,\bar B,p)$, that is, when does there exist, for some fixed n, a
sequence of codes $\{(f^{(m)},g^{(m)})\}$ (where for m = 1,2,...

  $f^{(m)}: A^m \to B^n$,  $g^{(m)}: \bar B^{nm} \to A$)

such that $P[X_1 \ne g^{(m)}(\bar Y_1,\dots,\bar Y_{nm})] = P[\hat X_1 \ne X_1] \to 0$? Before giving
partial answers to these questions certain properties of the above codes
will be singled out.

[Figure 4. Coding Relations for the Code $(f^{(m)},g^{(m)})$ of (3.1). Rows of
the diagram: source outputs, channel inputs, channel outputs, decoded
source outputs.]

An encoder $\{f_n: \prod_{i=-\infty}^{K(n)} A_i \to B, -\infty < n < \infty\}$ will be called a
finite-memory encoder if there exists a positive integer m' such that the
$f_n$, for each n, depend on at most the last m' coordinates of
$(\dots,X_{K(n)-1},X_{K(n)})$. The smallest such m' will be denoted by m and called
the duration of memory of $\{f_n\}$. The corresponding definition for decoders
is obvious.

An encoder $\{f_n\}$ will be said to satisfy a uniform timing constraint
$(k_e,n_e,s_e)$ if $K(n) = \lceil n/n_e \rceil k_e + s_e$ for all n. Here, $k_e$, $n_e$ and $s_e$ are
integers with $k_e$ and $n_e$ positive, and $\lceil a \rceil$ denotes the smallest integer
$\ge a$. The corresponding uniform timing constraint for decoders will be
denoted by $(k_d,n_d,s_d)$ where $N(k) = \lceil k/k_d \rceil n_d + s_d$ for all k.

The natural requirement for an encoder $\{f_n\}$ with memory duration
m and uniform timing constraint $(k_e,n_e,s_e)$ is that the functions $f_n$
themselves satisfy a uniformity requirement. An encoder $\{f_n\}$ with
memory duration m and uniform timing constraint $(k_e,n_e,s_e)$ will be called
a uniform finite-memory encoder (UFME) if there exists a function
$f^{(m)}: A^m \to B^{n_e}$ such that the $f_n$ are periodic with period $n_e$ and

  $f_j = f^{(m)}(j)$ for $j = 1,\dots,n_e$   (3.2)

where $f^{(m)}(j)$ denotes the jth component of $f^{(m)}$. As before, a similar
definition is obvious for uniform finite-memory decoders (UFMD).

For everything that remains it will be assumed that $k_e = k_d = k$,
$n_e = n_d = n$, and $s_e = 0$. (The positive integers k and n used here have
no relation to the dummy variables in N(k) and K(n).) In addition,
the memory duration of a UFME and a UFMD will always be m'k and m'n
respectively for some positive integer m'. At the risk of confusion,
the superscript ' will be dropped. For a given m, $s_d$ will be taken as
(m-1)n. Because k and n will always be understood a UFME will be
denoted by $f^{(m)}$ and a UFMD by $g^{(m)}$ where

  $f^{(m)}: A^{km} \to B^n$ and $g^{(m)}: \bar B^{nm} \to A^k$.   (3.3)

The code (3.3) corresponds to the code (3.1) for k = 1, n = 1.
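The uniform timing constraints can be tabulated directly. The sketch below assumes the constraints read $K(n) = \lceil n/n_e \rceil k_e + s_e$ and $N(k) = \lceil k/k_d \rceil n_d + s_d$; the function names and the numerical values are hypothetical and only illustrate the two special cases named in the text (a block code, and the code (3.1) with k = n = 1).

```python
def ceil_div(a, b):
    """Smallest integer >= a/b for integers a and b > 0 (valid for negative a)."""
    return -(-a // b)

def K(n, k_e, n_e, s_e=0):
    """Index of the last source output on which the nth channel input may depend."""
    return ceil_div(n, n_e) * k_e + s_e

def N(k, k_d, n_d, s_d=0):
    """Index of the last channel output used to decode the kth source output."""
    return ceil_div(k, k_d) * n_d + s_d

# Block-code timing with k = 2 source symbols per n = 3 channel inputs.
print([K(n, 2, 3) for n in range(1, 7)])          # K(1), ..., K(6)

# Code (3.1): k = n = 1, s_e = 0 and s_d = (m - 1) n with m = 3.
m = 3
print([(K(i, 1, 1), N(i, 1, 1, (m - 1) * 1)) for i in range(1, 5)])
```

In the second case the sketch reproduces K(n) = n and N(k) = k + m - 1, the timing stated for the code (3.1).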

Notation for a uniform finite-memory code (UFMC) $(f^{(m)},g^{(m)})$
can be greatly simplified to correspond to that of code (3.1). Let, for
all i,

  $U_i = (X_{(i-1)k+1},\dots,X_{ik})$,  $\hat U_i = (\hat X_{(i-1)k+1},\dots,\hat X_{ik})$,
  $V_i = (Y_{(i-1)n+1},\dots,Y_{in})$,  $\bar V_i = (\bar Y_{(i-1)n+1},\dots,\bar Y_{in})$.

Then

  $V_i = f^{(m)}(U_{i-m+1},\dots,U_i)$,  $\hat U_i = g^{(m)}(\bar V_i,\dots,\bar V_{i+m-1})$   (3.4)

for $-\infty < i < \infty$. The diagram of Figure 5 illustrates the relations for a
UFMC $(f^{(m)},g^{(m)})$. Note that the encoding and decoding rates for the
above UFMC are both equal to $\frac{k}{n}\log|A|$, and, as before with code (3.1),
$P[\hat U_i \ne U_i]$ is the same for all i.
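A uniform finite-memory encoder acts as a sliding window over the k-tuples of source outputs. The sketch below assumes the encoding relation $V_i = f^{(m)}(U_{i-m+1},\dots,U_i)$ and applies it to a finite stretch of the source. The helper name `ufme_encode` and the particular binary map `f` (k = 1, n = 2, m = 3, linear over the binary field) are hypothetical choices made for illustration.

```python
def ufme_encode(f, source, m):
    """Apply V_i = f(U_{i-m+1}, ..., U_i) over a finite list of k-tuples.

    `source` holds U_1, U_2, ...; the first output needs U_1, ..., U_m, so
    len(source) - m + 1 n-tuples are produced.
    """
    return [f(tuple(source[i - m + 1:i + 1]))
            for i in range(m - 1, len(source))]

def f(window):
    """Hypothetical f^(m) with k = 1, n = 2, m = 3: two parity checks mod 2."""
    u_old, u_mid, u_new = (u[0] for u in window)
    return ((u_new + u_mid + u_old) % 2, (u_new + u_old) % 2)

# Five source 1-tuples give three channel-input 2-tuples.
source = [(1,), (0,), (1,), (1,), (0,)]
print(ufme_encode(f, source, 3))
```

Each source symbol influences m consecutive output tuples, which is the reason the decoder of (3.1) looks at m consecutive channel-output tuples.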

Particular cases of uniform finite-memory encoders are quite prominent
in coding literature. For the case of a binary source and channel Elias
calls $f^{(m)}$ a convolutional encoder if $f^{(m)}$ is a linear function (in the
binary field sense) from $B^{km}$ to $B^n$, B = {0,1}. For other sources and channels
$f^{(m)}$ is called a sequential encoder by Reiffen when it is linear in the
sense of some finite field appropriate for both source and channel.
Decoding for these encoders, however, is quite different from that done here.
For example, the decision for $U_1$ is always made from $\bar V_1,\dots$ [...]
assuming that the preceding decoded outputs [...] are correct. $\hat U_2$ is then
determined from [...] and $\hat U_1$, assuming $\hat U_1$ is correct, and so on for
$U_3,\dots$ . The whole coding procedure is made into a block code by
periodically putting in m-1 known source symbols at appropriate intervals.

[Figure 5. Coding Relations for the UFMC $(f^{(m)},g^{(m)})$. Rows of the
diagram: k-tuples of source outputs, n-tuples of channel inputs,
n-tuples of channel outputs, k-tuples of decoded source outputs.]

To return to the questions at the beginning of this section one may
ask what is

  $C_u = \sup_{k,n,|A|}\{\frac{k}{n}\log|A| : P[\hat U_1 \ne U_1] \to 0$ as $m \to \infty$ for some sequence $\{(f^{(m)},g^{(m)})\}\}$.   (3.5)

$C_u$ is called the uniform finite-memory coding capacity of $(B,\bar B,p)$.
Because it is required that the error go to 0 with $m \to \infty$, k and n fixed,
$C_u$ is not just the expression for C' restricted to the class of uniform
finite-memory codes.

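Let $D_i[f^{(m)}, g^{(m)}]$ denote $P[\hat U_i \ne U_i]$ when a UFME $f^{(m)}$ and a UFMD $g^{(m)}$ are used. Because

$$D_i[f^{(m)}, g^{(m)}] = D_1[f^{(m)}, g^{(m)}]$$

for all $i$ with a UFMC $(f^{(m)}, g^{(m)})$, it follows from Theorem 6 that for any $m$ and $(f^{(m)}, g^{(m)})$ with $R = \frac{k}{n}\log|A| > C$

$$D_1[f^{(m)}, g^{(m)}] \ge \alpha > 0 \qquad (3.6)$$

where $\alpha = \alpha(C/R)$ is defined by (2.10). Hence $C_U \le C$.

In the appendix a stronger form of the above statement is proven which allows the decoder, for a UFME $f^{(m)}$, to use all channel outputs.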

Unfortunately, what $C_U$ is for a DMC $(B,S,p)$ is an open question. The next few paragraphs are devoted to exhibiting a class of channels for which $C_U > 0$.

Denote the minimum probability of error over all ways of deciding $U_i$ from $(U_a,\ldots,U_b; V_c,\ldots,V_d)$ ($a \le b$, $c \le d$) when a UFME $f^{(m)}$ is used by $e(U_i/f^{(m)}; U_a,\ldots,U_b; V_c,\ldots,V_d)$. Also, let $e(U_i/f^{(m)}; V_c,\ldots,V_d)$ be the minimum probability of error over all ways of deciding $U_i$ from $(V_c,\ldots,V_d)$. For example, for $c = 1$ and $d = m$, $e(U_1/f^{(m)}; V_1,\ldots,V_m) = \min_{g^{(m)}} P[\hat U_1 \ne U_1 / f^{(m)}, g^{(m)}]$.

Reiffen$^{(4)}$ has shown that for $R = \frac{k}{n}\log|A| < C$, there exists a sequence of UFME's $\{f^{(m)}: m = 1, 2, \ldots\}$ such that for all $m$

$$e(U_1/f^{(m)}; U_{-m+2},\ldots,U_0; V_1,\ldots,V_m) \le Q(m)\,2^{-mnE(R)} \qquad (3.7)$$

where $Q(\cdot)$ is a polynomial in $m$ (whose coefficients depend on $n$) and $E(R) > 0$ for all $R < C$. Before making use of (3.7) three lemmas are given below, the first two of which are routine.

Lemma 4. For a UFME $f^{(m)}$ and positive integer $r$ ($m \ge 2$)

$$e(U_1/f^{(m)}; U_{-m-r+2},\ldots,U_{-r+1}; V_{-r+2},\ldots,V_m) \le r\, e(U_1/f^{(m)}; U_{-m+2},\ldots,U_0; V_1,\ldots,V_m).$$

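Lemma 5. For any finite-valued random variables $X$ and $Y$ let

$$e(X/Y) = \min_g P[g(Y) \ne X].$$

Then for any sequence $\{Y_i\}$ of finite-valued random variables with the same finite range $B$

$$\lim_{n \to \infty} e(X/(Y_1, Y_2, \ldots, Y_n)) = e(X/(Y_1, Y_2, \ldots))$$

where $e(X/(Y_1, Y_2, \ldots)) = \inf_g P[g(Y_1, Y_2, \ldots) \ne X]$, the infimum being over all $g$ with $g^{-1}(x) \in \sigma(B^\infty)$ for all $x$.

Let $e(U_1/f^{(m)}; V_1, V_2, \ldots) = \inf_g P[g(V_1, V_2, \ldots) \ne U_1]$ when $f^{(m)}$ is the encoder, the infimum being over all measurable $g$, that is, over all $g$ with $g^{-1}(u)$ in the corresponding $\sigma$-field for all $u$.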
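The quantity $e(X/Y)$ in Lemma 5 can be computed directly for small finite examples, since the best decision rule picks, for each observed value, the most probable value of $X$. The sketch below does this for a hypothetical example that is not from the text: a uniform bit observed through $n$ independent uses of a binary symmetric channel. The function names and the example distribution are assumptions.

```python
from collections import defaultdict
from itertools import product
from typing import Dict, Hashable, Tuple


def min_error_probability(joint: Dict[Tuple[Hashable, Hashable], float]) -> float:
    """e(X/Y) = min over g of P[g(Y) != X] = 1 - sum over y of max over x of P[X=x, Y=y]."""
    best = defaultdict(float)
    for (x, y), prob in joint.items():
        best[y] = max(best[y], prob)
    return 1.0 - sum(best.values())


def repeated_bsc_joint(n: int, p0: float) -> Dict[Tuple[int, Tuple[int, ...]], float]:
    """Joint pmf of a uniform bit X and n independent BSC(p0) observations of X."""
    joint = {}
    for x in (0, 1):
        for ys in product((0, 1), repeat=n):
            flips = sum(y != x for y in ys)
            joint[(x, ys)] = 0.5 * (p0 ** flips) * ((1.0 - p0) ** (n - flips))
    return joint
```

In this example the error with one, two and three observations is non-increasing in the number of observations, which is the monotone behaviour the limit in Lemma 5 relies on.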

Lemma 6. Suppose there exists a sequence of UFME's $\{f^{(m)}\}$ such that $e(U_1/f^{(m)}; V_1, V_2, \ldots) \to 0$. Then there exists a sequence $\{f'^{(m)}\}$ of UFME's such that $e(U_1/f'^{(m)}; V_1,\ldots,V_m) \to 0$.

Proof. Let $a_{m,r} = \min e(U_1/f^{(m)}; \ldots)$, the minimum being taken over all UFME's $f^{(m)}$ of memory $m$. Then

(i) $a_{m,r} \ge a_{m,r+1}$ for all $r$, …

These statements follow from Lemma 2 and the definition of $a_{m,r}$.

(ii) …

To prove (ii) notice that for $f^{(m)}$, the UFME which yields $a_{m,r}$, the encoder $h^{(m+1)}$ defined by … merely advances the $V$'s one coordinate, so that the probability distribution of $(U_1; V_{r+1}, \ldots)$ with $f^{(m)}$ is the same as that of $(U_1; \ldots)$ with $h^{(m+1)}$.

From (i) and (ii) it follows that … the sequence … of UFME's has $e(U_1/\,\cdot\,; V_1,\ldots,V_m) \to 0$. This completes the proof.

From Lemma 6, whenever there exists a sequence $\{f^{(m)}\}$ such that $e(U_1/f^{(m)}; V_1, V_2, \ldots) \to 0$ then there exists a sequence $\{(f^{(m)}, g^{(m)})\}$ of UFMC's such that $P[\hat U_1 \ne U_1 / f^{(m)}, g^{(m)}] \to 0$. A class of channels will now be given for which $e(U_1/f^{(m)}; V_1, V_2, \ldots) \to 0$ for some sequence $\{f^{(m)}\}$.
The following definitions will be needed. The product of two DMC's $(B,S,p)$ and $(B',S',p')$ is the DMC $(B \times B', S \times S', q)$ where $q(\tilde y, \tilde y'/y, y') = p(\tilde y/y)\,p'(\tilde y'/y')$ for all $y$, $\tilde y$, $y'$, $\tilde y'$. A binary symmetric channel (BSC) has $B = S = \{0,1\}$ with

$$p(0/0) = p(1/1) = q,$$
$$p(0/1) = p(1/0) = p,$$

and $p + q = 1$.
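The product construction above can be sketched concretely by storing a channel as a table of output probabilities for each input. The representation `channel[input][output]`, the erasure symbol `"e"`, and the function names below are assumptions made for illustration only.

```python
from itertools import product
from typing import Dict, Hashable

Channel = Dict[Hashable, Dict[Hashable, float]]


def bsc(p0: float) -> Channel:
    """Binary symmetric channel with crossover probability p0 and q0 = 1 - p0."""
    q0 = 1.0 - p0
    return {0: {0: q0, 1: p0}, 1: {0: p0, 1: q0}}


def tuple_erasure_channel(k: int, q_pass: float) -> Channel:
    """k-tuple erasure channel: the whole k-tuple passes with probability q_pass,
    otherwise it is replaced by the erasure symbol "e"."""
    return {u: {u: q_pass, "e": 1.0 - q_pass} for u in product((0, 1), repeat=k)}


def product_channel(ch1: Channel, ch2: Channel) -> Channel:
    """Product channel: q((s, s') / (y, y')) = p(s / y) * p'(s' / y')."""
    return {
        (y1, y2): {
            (s1, s2): a * b
            for s1, a in ch1[y1].items()
            for s2, b in ch2[y2].items()
        }
        for y1 in ch1
        for y2 in ch2
    }
```

Each row of the resulting table is again a probability distribution, because the two component channels act independently on their own inputs.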

BSC $(p_0, q_0)$ denotes a BSC with $q = q_0$, $p = p_0$. The k-tuple erasure channel accepts k-tuples of 0's and 1's and erases the whole k-tuple with probability $p'$, or lets it through correctly with probability $q'$. Consider a channel which is the product of the n-extension of the BSC $(p_0, q_0)$ and the k-tuple erasure channel with $q' = q^k$. Suppose the encoding of a $\{0,1\}$ source to the BSC $(p_0, q_0)$ is done by a UFME $f^{(m)}$ (an element of a sequence satisfying (3.7)) and the encoding of the source to the erasure channel is done by using the k-tuple of source outputs directly as an erasure channel input. A diagram is given below (Figure 6).

If the product channel inputs are denoted by $Y'_i = (U_i, Y_i)$ it is clear that the encoder given by

$$Y'_i = (U_i, f^{(m)}(U_{i-m+1}, \ldots, U_i))$$

is also a UFME $f'^{(m)}$ of memory duration $m$ for the product channel. Decoding from the sequence $(\ldots, V'_{-1}, V'_0, V'_1, \ldots, V'_m)$ is accomplished as follows. Starting at the zero coordinate, $m-1$ consecutive unerased $U$'s are sought. Suppose that … is the first run of $m-1$ $U$'s not erased ($r \ge m-1$). By Lemma 3, $U_1$ can be decided using only … with a probability of error

$$\le (r-m+2)\,Q(m)\,2^{-mnE(R)}.$$

[Figure 6. Encoding Relations for the Product of the n-extension of the BSC and the k-tuple Erasure Channel. The figure relates the k-tuples of source outputs (0's and 1's), the n-tuples of inputs to the BSC, the product channel inputs $Y'_i = (U_i, Y_i)$, and the product channel outputs $V'_i = (\tilde U_i, V_i)$, where $\tilde U_i$ denotes the erasure channel output for input $U_i$.]

Let $p_r$ be the probability of finding $m-1$ consecutive $U$'s not erased for the first time at coordinate $-r$, $r = 1, 2, \ldots$. Then

$$e(U_1/f'^{(m)}; \ldots, V'_0, V'_1, \ldots, V'_m) \le \sum_r (r-m+2)\,p_r\,Q(m)\,2^{-mnE(R)}$$

$$\le \sum_r r\,p_r\,Q(m)\,2^{-mnE(R)} \qquad (\text{assume } m \ge 2)$$

$$= \frac{1 - q'^{\,m-1}}{p'\,q'^{\,m-1}}\,Q(m)\,2^{-mnE(R)}$$

(Feller, page 300 with Feller's $p$, $q$ and $r$ replaced by $q'$, $p'$ and $m-1$ respectively)

$\to 0$ if $[nE(R) - k\log 1/q] > 0$.
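The sum over $r\,p_r$ is a mean waiting time for a run of unerased k-tuples, which is what the Feller citation supplies. As a rough check, the sketch below compares the standard closed form for the mean waiting time until the first run of `run_len` successes, $(1 - q^{L})/((1-q)\,q^{L})$ with success probability $q$ and run length $L$, against a direct solution of the first-step equations. Reading the text's sum as exactly this mean, and the function names, are assumptions.

```python
def mean_wait_closed_form(q_pass: float, run_len: int) -> float:
    """Mean number of trials until the first run of run_len successes,
    each trial succeeding independently with probability q_pass (0 < q_pass < 1)."""
    p_erase = 1.0 - q_pass
    return (1.0 - q_pass ** run_len) / (p_erase * q_pass ** run_len)


def mean_wait_recursive(q_pass: float, run_len: int) -> float:
    """Same mean from the first-step equations
    E_j = 1 + q_pass * E_{j+1} + (1 - q_pass) * E_0, with E_run_len = 0,
    where j is the length of the current success streak."""
    # Write E_j = a_j + b_j * E_0 and work backwards from j = run_len.
    a, b = 0.0, 0.0
    for _ in range(run_len):
        a, b = 1.0 + q_pass * a, q_pass * b + (1.0 - q_pass)
    # Now E_0 = a + b * E_0, so E_0 = a / (1 - b).
    return a / (1.0 - b)
```

In the setting of the text one would take `q_pass` to be $q'$ and `run_len` to be $m-1$; the mean grows like $q'^{-(m-1)}$, which is the factor that must be beaten by $2^{-mnE(R)}$.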

Since only nonpositive coordinates of $V'$ are used to find $m-1$ consecutive unerased $U$'s one can write

$$e(U_1/f'^{(m)}; \ldots, V'_{-1}, V'_0, V_1, \ldots, V_m) \le \frac{q'}{p'}\,Q(m)\,2^{-m[nE(R) - k\log 1/q]}$$

Examining the coefficient $[nE(R) - k\log 1/q]$ of $m$ it is clear that if it is greater than zero for some $k_0$, $n_0$ with $k_0/n_0 = R$ then it is greater than zero for all $k$, $n$ with $k/n = R$. For the case here, $R$ is the rate for the BSC so that $R < 1 - h(p_0)$. (The rate for the product channel is just $k$.) By selecting a $q$ sufficiently close to 1 depending on $R$ and $h(p_0)$, there exists a sequence $\{f'^{(m)}\}$ such that $e(U_1/f'^{(m)}; \ldots) \to 0$.
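The claim that the sign of the coefficient depends only on the ratio $k/n$ can be checked by substituting $k = nR$. The short derivation below assumes logarithms to base 2 and treats $E(R)$ as a given positive number; the explicit threshold on $q$ is a consequence of those assumptions rather than a statement from the text.

```latex
\begin{align*}
nE(R) - k\log\frac{1}{q}
  &= nE(R) - nR\log\frac{1}{q} \qquad (k = nR) \\
  &= n\left[E(R) - R\log\frac{1}{q}\right].
\end{align*}
% The sign does not depend on n, and the bracket is positive exactly when
\[
  \log\frac{1}{q} < \frac{E(R)}{R}
  \quad\Longleftrightarrow\quad
  q > 2^{-E(R)/R}.
\]
```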

Once $q$ is fixed, $k$ can be selected such that $R = k/n$ is fixed and the block coding channel capacity $kq'$ of the erasure channel is arbitrarily small. Thus one has the situation of being able to make a product channel from an n-extension of the BSC (block coding channel capacity $n[1 - h(p_0)]$) with an erasure channel of arbitrarily small block coding capacity and still have a sequence $\{f'^{(m)}\}$ with $R = k/n$ such that $e(U_1/f'^{(m)}; \ldots) \to 0$.
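The two capacities compared in this paragraph are easy to tabulate. The sketch below assumes capacities measured in bits, $h$ equal to the binary entropy function, and the relation $q' = q^k$ between the per-symbol and per-k-tuple pass probabilities; the function names are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_extension_capacity(n: int, p0: float) -> float:
    """Block coding capacity n[1 - h(p0)] of the n-extension of a BSC."""
    return n * (1.0 - binary_entropy(p0))


def erasure_capacity(k: int, q: float) -> float:
    """Block coding capacity k * q' of the k-tuple erasure channel,
    under the assumption q' = q ** k."""
    return k * q ** k
```

For a fixed $q < 1$ the product $k\,q^k$ eventually decreases to 0 as $k$ grows, while the capacity of the n-extension of the BSC grows linearly in $n$, which is the contrast the paragraph draws.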

The same demonstration can be carried out for a general DMC $(B,S,p)$ and source $\{X_i\}$ if the k-tuple erasure channel takes k-tuples $(x_1, \ldots, x_k)$, $x_j \in A$, as inputs. The block coding channel capacity of the erasure channel is then $kq'\log|A|$ where, as before, $q'$ is the probability that any k-tuple goes through unerased.

Aside from calculating $C_U$ for a given DMC it would be nice to know if $C_U > 0$ for a channel which does not have the erasure properties used above (for example, is $C_U > 0$ for the BSC). It would also be useful to have an example, if possible, for which $C_U = C$.

Another question arises from the fact that Lemma 6 does not provide any information about how fast $P[\hat U_1 \ne U_1/f^{(m)}, g^{(m)}]$ tends to 0 with $m$. For example, when

$$e(U_1/f^{(m)}; \ldots) \le \ldots,$$

and hence tends to 0 exponentially with $m$, does $e(U_1/f^{(m)}; V_1,\ldots,V_m)$ tend to 0 exponentially with $m$? More generally, one would like to know how much smaller $e(U_1/f^{(m)}; V_{-r'+2},\ldots,V_{m+r''-1})$ is than $e(U_1/f^{(m)}; V_1,\ldots,V_m)$ for $r', r'' = 1, 2, \ldots$.

In this section the probability distributions $p(x_1,\ldots,x_i; \tilde y_1,\ldots,\tilde y_j)$, $i, j = 1, 2, \ldots$ are defined for a function $f$ which is $(\sigma(A^\infty), \sigma(B^\infty))$-measurable. Terms used can be found in Loève.

For a DMC $(B,S,p)$ let $p(\cdot/y)$ be the product probability on $\sigma(S^\infty)$ determined by the probabilities $p(\cdot/y_i)$ on $S$, $i = 1, 2, \ldots$, $y = (y_1, y_2, \ldots)$. Then $p(\cdot/\cdot)$ is a $\sigma(B^\infty)$-measurable probability, that is,
(i) $p(\cdot/y)$ is a probability on $\sigma(S^\infty)$ for each $y \in B^\infty$ and
(ii) for each set $\tilde S \in \sigma(S^\infty)$, $p(\tilde S/\cdot)$ is a $\sigma(B^\infty)$-measurable function.
For a source $\{X_i\}$ and a function $f: A^\infty \to B^\infty$ which is $(\sigma(A^\infty), \sigma(B^\infty))$-measurable, the function $p(\cdot/f(\cdot))$ is a $\sigma(A^\infty)$-measurable probability. From $p(\cdot/f(\cdot))$ a probability $Q$ is defined on $\sigma(A^\infty) \times \sigma(S^\infty)$ by

$$Q(W) = \int p(W(x)/f(x))\,d\mu, \qquad W \in \sigma(A^\infty) \times \sigma(S^\infty)$$

where $\mu$ is the probability on $\sigma(A^\infty)$ corresponding to the source variables being independent, identically and uniformly distributed and $W(x)$ is the section of $W$ at $x$. All probabilities $p(x_1,\ldots,x_i; \tilde y_1,\ldots,\tilde y_j)$ are then just the marginal probabilities of $Q$.

It can be shown that the probabilities $p(x_1,\ldots,x_i; \tilde y_1,\ldots,\tilde y_j)$ thus defined are the same as those used in section II when $f$ is given by the encoder …

Let $f^{(m)}$ be any UFME for $\{X_i\}$ and $(B,S,p)$ such that $R = \frac{k}{n}\log|A| > C$. Then for any function $g$ such that $g^{-1}(u) \in \ldots$

$$P[g(\ldots, V_{-1}, V_0, V_1, \ldots) \ne U_1] \ge \alpha(C/R)$$

for all $m$.


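Proof. For any positive integers $r$ and $t$

$$I(U_1,\ldots,U_t; V_{-r+1},\ldots,V_{r+t}) \le I(Y_{-r+1},\ldots,Y_{r+t}; V_{-r+1},\ldots,V_{r+t}) \le (2r+t)nC.$$

Since $H(U_1,\ldots,U_t) = tk\log|A|$ it follows

$$H(U_1,\ldots,U_t/V_{-r+1},\ldots,V_{r+t}) \ge tk\log|A| - (2r+t)nC.$$

Moreover

$$H(U_1,\ldots,U_t/V_{-r+1},\ldots,V_{r+t}) \le \sum_{i=1}^{t} H(U_i/V_{-r+1},\ldots,V_{r+t})$$

$$\le \sum_{i=1}^{t} H(U_i/V_{-r+i},\ldots,V_{r+i})$$

$$= t\,H(U_1/V_{-r+1},\ldots,V_{r+1})$$

since the probabilities $p(u_i; v_{-r+i},\ldots,v_{r+i})$ are the same for all $i$.

$$\therefore\ H(U_1/V_{-r+1},\ldots,V_{r+1}) \ge k\log|A|\left(1 - \frac{2r+t}{t}\,\frac{C}{R}\right)$$

for all positive integers $t$ and $r$ $\Longrightarrow$

$$H(U_1/V_{-r+1},\ldots,V_{r+1}) \ge k\log|A|\left(1 - \frac{C}{R}\right)$$

for all positive integers $r$.

Because $\sigma(V_{-r+1},\ldots,V_{r+1}) \uparrow \sigma(\ldots, V_0, V_1, \ldots)$ it follows

$$H(U_1/\ldots, V_0, V_1, \ldots) \ge k\log|A|\,(1 - C/R).$$

Since $k\log|A| \ge 1$ it follows immediately from Fano's inequality (1.3) that

$$P[g(\ldots, V_0, V_1, \ldots) \ne U_1] + h(P[g(\ldots, V_0, V_1, \ldots) \ne U_1]) \ge 1 - C/R$$

so that $P[g(\ldots, V_0, V_1, \ldots) \ne U_1] \ge \alpha(C/R)$ where $\alpha$ is defined by (2.10).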
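The last step of the proof turns an inequality of the form $P + h(P) \ge 1 - C/R$ into a positive lower bound on the error probability $P$. Equation (2.10) is not reproduced in this section, so the sketch below assumes that $\alpha(x)$ is the smallest $p$ in $[0, 1/2]$ satisfying $p + h(p) \ge 1 - x$, with $h$ the binary entropy in bits; the function names and this reading of (2.10) are assumptions, not the author's definition.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def alpha_lower_bound(c_over_r: float, tol: float = 1e-12) -> float:
    """Smallest p in [0, 1/2] with p + h(p) >= 1 - c_over_r (assumed form of alpha).

    p + h(p) is increasing on [0, 1/2], so bisection applies.
    """
    target = 1.0 - c_over_r
    if target <= 0.0:
        return 0.0  # no constraint when the rate does not exceed capacity
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid + binary_entropy(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi
```

Under this reading the bound is zero when $C/R \ge 1$ and strictly positive when $R > C$, growing as $C/R$ decreases.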

H 

Fano's inequality (see (1.3)) was assumed (and can be proven) for the case: $X$ is finite valued, $Y$ is an arbitrary random variable and $g$ is measurable on the range of $Y$ (that is, $g^{-1}(x)$ is in the $\sigma$-field of measurable sets for the range of $Y$, $x \in X$). $H(X/Y)$ is defined for this case as the expectation of the random variable almost surely defined by

$$-\log P([X = x]/Y) \quad \text{on the set } [X = x].$$

$\sigma(\prod_{i=-\infty}^{\infty} \ldots)$ is the $\sigma$-field of subsets of $\prod_{i=-\infty}^{\infty} \ldots$ determined by cylinder sets.
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